SUMMARY

TAIL BEHAVIOUR FOR SUPREMA OF EMPIRICAL PROCESSES

We consider multivariate empirical processes $X_n(t) := \sqrt{n}\,(F_n(t) - F(t))$,
where $F_n$ is an empirical distribution function based on i.i.d. variables
with distribution function $F$, and $t \in \mathbb{R}^k$. For $X_F$ the weak limit
of $X_n$, it is shown that

$$c(F,k)\,\lambda^{2(k-1)}e^{-2\lambda^2} \le P\{\sup_t X_F(t) > \lambda\} \le C(k)\,\lambda^{2(k-1)}e^{-2\lambda^2}$$

for large $\lambda$ and appropriate constants $c$, $C$. When $k = 2$ these
constants  can  be  identified,  thus  permitting  the  development  of 


Kolmogorov-Smirnov  tests  for  bivariate  problems.  For  general  k 
1.  INTRODUCTION

It is well known that the Kolmogorov-Smirnov (KS) statistic,

based  on  a  sample  from  any  univariate  random  variable  with  continuous 

distribution  function  (d.f.),  is  distribution  free.  It  is  also  well 

known  that  in  the  multivariate  situation  this  is  not  the  case,  and 

it  is  to  this  situation  that  we  shall  soon  direct  our  efforts.  In 

the beginning, however, Kolmogorov (1933) showed that the one-sided
statistic, $T_n = \sup\{\sqrt{n}\,(F_n(x) - F(x)) : x \in \mathbb{R}^1\}$, where $F$ denotes the
underlying d.f. and $F_n$ the empirical d.f., satisfies

(1.1)  $P\{T_n > \lambda\} \to e^{-2\lambda^2}$  $\forall \lambda$, as $n \to \infty$.

Smirnov (1944) extended this result to the two-sample problem.
Feller (1948) gave it a neater proof, and Doob (1949) followed by
Donsker (1951, 52) and the theory of weak convergence explained it
in terms of the convergence of $\sqrt{n}\,(F_n - F)$ to a limiting Gaussian process
whose maximum had the tail distribution $\exp(-2\lambda^2)$.
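As a small illustration of the statistic in (1.1), the following sketch computes the one-sided KS statistic for a univariate sample and evaluates the limiting tail. The function names `one_sided_ks` and `limiting_tail` are hypothetical, introduced here only for illustration, and the d.f. passed in is assumed to be continuous.

```python
import math

def one_sided_ks(sample, cdf):
    """Return T_n = sqrt(n) * sup_x (F_n(x) - F(x)) for a continuous d.f. `cdf`.

    F_n is right-continuous and F is non-decreasing, so the supremum is
    attained at an order statistic x_(i), where F_n jumps to i/n.
    """
    xs = sorted(sample)
    n = len(xs)
    d_plus = max((i + 1) / n - cdf(x) for i, x in enumerate(xs))
    return math.sqrt(n) * d_plus

def limiting_tail(lam):
    """Limiting tail exp(-2 lam^2) of (1.1), for lam >= 0."""
    return math.exp(-2.0 * lam * lam)
```

For the sample (0.25, 0.5, 0.75) under the uniform d.f. on [0,1], the largest gap occurs at the top order statistic, giving $T_3 = \sqrt{3}\,(1 - 0.75)$.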

In  the  multivariate  case,  there  is  no  simple  analogue  to  (1.1), 

and  the  best  one  can  hope  to  obtain  is  either  a  limiting  distribution 

for  some  specific  F,  or  bounds  that  may  be  valid  for  a  family  of 

F's  sharing,  perhaps,  some  regularity  properties.  The  first  attack 

on this problem was made by Kiefer and Wolfowitz (1958), who showed
that if $T_n^{(k)}$ is the one-sided KS statistic in $k$ dimensions, then
for some $\alpha = \alpha(k) > 0$ and $c = c(k) < \infty$,

(1.2)  $P\{T_n^{(k)} > \lambda\} \le c\,e^{-\alpha\lambda^2}$  $\forall n, \lambda, F$.

Despite the fact that this bound is obviously very crude, it did at
least suffice to prove the existence of a limiting distribution for
$T_n^{(k)}$ as $n \to \infty$. (The full weak convergence of the empirical d.f.
to an appropriate limiting Gaussian random field was later established
by Dudley (1966, 67).) However, although Kiefer and Wolfowitz
established the existence of this limiting distribution, no explicit form
for it is known. Indeed, there is only one non-trivial case where
reasonably accurate bounds are known, this being the case where $F$
is uniform on the unit square. Here the limiting distribution of
$\sqrt{n}\,(F_n - F)$ is that of a pinned Brownian sheet, and fairly close lower
and upper bounds on the distribution of its maximum appear in Goodman
(1976) and Cabana and Wschebor (1982), respectively. We shall have more
to say on this later, when Goodman's lower bound is extended to
arbitrary dimensions.

In a classic paper, Kiefer (1961) greatly improved on (1.2)
and showed that for all $\varepsilon > 0$ there is a $c = c(k,\varepsilon)$ such that

(1.3)  $P\{T_n^{(k)} > \lambda\} \le c\,e^{-2(1-\varepsilon)\lambda^2}$  $\forall n, \lambda, F$.

This is a particularly interesting bound since, viewed as a result
on the maximum of the limiting Gaussian field, rather than as a
result on $T_n^{(k)}$ itself, it is one of the few forerunners of the
general inequality for continuous Gaussian processes, $X(t)$, that
states that for all sufficiently large $\lambda$

(1.4)  $P\{\sup_t X(t) > \lambda\} \le e^{-\alpha\lambda^2}$,  $\forall \alpha < (2\sigma^2)^{-1}$,

where $\sigma^2 = \sup_t(\operatorname{var} X(t))$. (Fernique (1970, 71), Landau and Shepp
(1971), Marcus and Shepp (1971).) Note that since $\sup_x\{\operatorname{var}[\sqrt{n}\,(F_n(x)-F(x))]\} = 1/4$,
Kiefer's bound is, today, a simple consequence of (1.4) and weak
convergence. Nevertheless, in its time, Kiefer's result was of substantial
interest, since it was, to the best of our knowledge, the first time
that a uniform bound was placed on the maxima of a large family of
Gaussian processes. (The statistical significance of such a lower
bound is that it permits construction of "confidence intervals" for an
unknown $F$.) Furthermore, Kiefer exploited (1.3) to prove a law of the
iterated logarithm (LIL) for the multivariate KS statistic.

The main thrust of the current work will be to further refine
(1.3), in two directions, and then to investigate the consequences
of the refinement. For a start, we shall show (Section 4) that (1.3)
can be replaced by: There is a $c = c(k)$ such that

(1.5)  $P\{T_n^{(k)} > \lambda\} \le c\,\lambda^{2(k-1)}e^{-2\lambda^2}$,  $\forall n, \lambda, F$.

This, as with Kiefer's result, is of interest beyond the KS situation,
since, in the Gaussian process setting, it provides a family of
processes for which (1.4) can be improved upon. However, we can do better
than just (1.5), and we shall also show that as long as $F$ satisfies
mild regularity conditions, there is a $c = c(F)$ such that

(1.6)  $P\{T_n^{(k)} > \lambda\} \ge c\,\lambda^{2(k-1)}e^{-2\lambda^2}$,  $\forall n, \lambda$.

The upper and lower bound together enable us to improve on Kiefer's
LIL, and to obtain an exact upper-lower class result in its place
(Section 5).
An upper bound similar in spirit to (1.5) has recently been obtained
by Alexander (1983). Treating a more general situation of empirical
measures, $\nu_n$, indexed by a Vapnik-Cervonenkis class of functions, $\mathcal{F}$ say,
he showed that

vol2 

P{sup|v  (f)  1  >  X}  _<  16X  .  e  ,  V  X  8  , 

feF  n 

where $v$ is a strictly positive integer describing the "size" of $\mathcal{F}$.
Alexander's result, while clearly being an improvement on (1.3),
also gives, for the cases we consider, an enormous over-estimate of
the power of $\lambda$ in the upper bound.

Unlike  Alexander,  however,  we  shall  have  little  to  say  about  the 
sizes  of  the  constants  in  our  bounds,  other  than  to  guarantee  their 
finiteness.  Thus,  from  the  point  of  view  of  actually  applying  the 
KS  statistic  in  a  statistical  setting,  these  results  are  of  limited 
interest.  We  shall  remedy  this  situation  in  Section  3,  where,  for  the 
two-dimensional  case,  we  shall  develop  an  explicit,  sharp,  upper  bound, 
and  a  reasonable  lower  bound.  The  various  applications  of  these  results 
are  spelled  out  in  detail  in  Brown  and  Adler  (1984).  The  argument 
leading  to  the  upper  bound  is  rather  interesting,  since  it  is  based 
on  finding  the  worst  possible  F  (a  task  actually  performed  by 
Kiefer)  and  comparing  it,  via  Slepian's  (1962)  inequality,  to  all  other 
cases.  The  distribution  of  the  maximum  in  the  worst  possible  case  is 
what  then  provides  the  bound.  In  fact,  this  methodology  of  "comparison" 
will  also  be  used  to  obtain  the  lower  bound  (1.6),  and  may,  in  a  certain 
sense,  be  considered  the  main  methodological  theme  of  this  paper. 

The following section is devoted to peripheral and support
material. There we obtain lower bounds for the distribution of the maximum
of the pinned Brownian sheet in $k$ dimensions, and some related distributions.
While  these  do  have  some  intrinsic  interest,  our  main  interest  in  them 


will  arise  from  their  usefulness  as  "comparison  distributions".  We 
close  this  section  with  notation  and  some  background  results. 

Let $X_1, X_2, \ldots,$ be independent random variables with d.f.
$F(x)$, which we assume to be continuous, and which can therefore,
w.l.o.g., be taken to be concentrated on the unit cube $I^k = [0,1]^k$
of $\mathbb{R}^k$ with univariate marginals uniform on $[0,1]$. We denote a
point in $I^k$ by either $x$ or $(x_1, \ldots, x_k)$ and introduce the
usual partial order

$x \le y \iff x_i \le y_i,\ i = 1, \ldots, k$,  $x, y \in I^k$.

For $x \le y$ we write $[x,y]$ for the set $\prod_{i=1}^{k}[x_i, y_i]$, and use
$1_A(\cdot)$ for the indicator function of the set $A \subset I^k$. Thus we can
formally introduce the empirical d.f. as

(1.7)  $F_n(x) = n^{-1}\sum_{i=1}^{n} 1_{[0,x]}(X_i)$.
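The definition (1.7) can be sketched directly in code: $F_n(x)$ is the fraction of sample points lying in the box $[0,x]$ under the coordinatewise partial order. The function name `empirical_df` is a hypothetical choice for this illustration, and sample points are assumed to be tuples of equal length.

```python
def empirical_df(points, x):
    """Multivariate empirical d.f. F_n(x) as in (1.7).

    Counts the sample points X_i with X_i <= x coordinatewise,
    i.e. those falling in the box [0, x], and divides by n.
    """
    n = len(points)
    inside = sum(1 for p in points
                 if all(pj <= xj for pj, xj in zip(p, x)))
    return inside / n
```

With three bivariate points (0.2, 0.8), (0.5, 0.5), (0.9, 0.1), only the middle one lies in $[0,(0.6,0.6)]$, so $F_3(0.6,0.6) = 1/3$.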


Let $W_F$ be the pinned Brownian sheet based on $F$; i.e., the zero
mean Gaussian process with covariance function

(1.8)  $R_F(x,y) = E\{W_F(x)W_F(y)\} = F(x \wedge y) - F(x)F(y)$,  $x, y \in I^k$,

where $x \wedge y$ is the coordinatewise minimum $(x_1 \wedge y_1, \ldots, x_k \wedge y_k)$.
Then, as is well known, ($k = 1$, Donsker (1952); $k > 1$, Dudley
(1966, 67)), $\sqrt{n}\,(F_n - F)$ converges weakly to $W_F$ in the space of all
bounded functions on $I^k$. Thus, in particular, if

(1.9)  $T_n = T_n^k(F) := \sup\{\sqrt{n}\,(F_n(x) - F(x)) : x \in I^k\}$

is the one-sided KS statistic, then

(1.10)  $T_n^k(F) \to M_F := \sup\{W_F(x) : x \in I^k\}$  as $n \to \infty$.

This last result provides the obvious motivation for the next
two sections, both of which are concerned with the distribution of
$M_F$. In fact, one can go beyond the central limit result (1.9)
to  a  much  stronger  embedding  type  result.  However,  since  we  shall 
not  need  this  result  until  Section  5,  we  shall  introduce  it  only 


2.  TWO  SPECIAL  CASES 


We consider firstly the distribution of $M_F$ when $F$ is the
uniform distribution, $U$ say, on $I^k$. We shall, however, require a
slightly more general result later, and to this end let $W^{(k)}$ denote
the (unpinned) Brownian sheet on $I^k$, i.e., the zero mean Gaussian
process with covariance function

$E\{W^{(k)}(x)W^{(k)}(y)\} = \prod_{i=1}^{k}(x_i \wedge y_i)$,  $x, y \in \mathbb{R}^k$,

and write $\mathring{W}^{(k)}$ for the pinned version of $W^{(k)}$ on $I^k$. Then a
version of $\mathring{W}^{(k)}$ can be obtained from $W^{(k)}$ by the correspondence

(2.1)  $\mathring{W}^{(k)}(x) = W^{(k)}(x) - |x|\,W^{(k)}(\mathbf{1})$,  $x \in I^k$,

where $|x| = \prod_{i=1}^{k} x_i$. In the general notation of the previous section
$\mathring{W}^{(k)} = W_U$. The result we shall need is

Theorem 2.1

(2.2)  $P\{\sup_{I^k} W^{(k)}(x) > \lambda \mid W^{(k)}(\mathbf{1}) = w\} \ge e^{-2\lambda(\lambda-w)}\sum_{n=0}^{k-1}[2\lambda(\lambda-w)]^n/(n!)$,

a.s., for all $\lambda > w$. Furthermore, the case $w = 0$ yields

(2.3)  $P\{\sup_{I^k} \mathring{W}^{(k)}(x) > \lambda\} \ge e^{-2\lambda^2}\sum_{n=0}^{k-1}(2\lambda^2)^n/(n!)$.

When $k = 1$ (2.2) follows immediately from the reflection principle
(Feller (1971)). When $k = 2$, (2.3) is given explicitly in Goodman (1976)
and (2.2) is also there implicitly. Cabana and Wschebor (1982) and
Park and Skoug (1978) also have (2.2) in the two dimensional case, and
shortly after obtaining the above result for general $k$, we received
a copy of Cabana (1982) which states the same result with a virtually
identical, albeit more detailed, proof. However, since Cabana's
paper is not readily available, as well as for the sake of
completeness, we shall give a brief proof of the theorem.
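The bounds of Theorem 2.1 are easy to evaluate numerically. The sketch below assumes the reading of (2.2) and (2.3) as $e^{-s}\sum_{n=0}^{k-1}s^n/n!$ with $s = 2\lambda(\lambda - w)$; the function names are hypothetical. Note that this expression is the probability that a Poisson variable with mean $s$ is at most $k-1$, so it always lies in $(0,1]$.

```python
import math

def conditional_lower_bound(lam, w, k):
    """Right-hand side of (2.2): exp(-s) * sum_{n<k} s^n / n!,
    with s = 2*lam*(lam - w). Requires lam > w and k >= 1."""
    if lam <= w:
        raise ValueError("the bound is stated only for lam > w")
    s = 2.0 * lam * (lam - w)
    return math.exp(-s) * sum(s ** n / math.factorial(n) for n in range(k))

def pinned_lower_bound(lam, k):
    """Right-hand side of (2.3), the case w = 0 of (2.2)."""
    return conditional_lower_bound(lam, 0.0, k)
```

For $k = 1$ this reduces to the classical $e^{-2\lambda^2}$, and for $k = 2$ to $(1 + 2\lambda^2)e^{-2\lambda^2}$.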


Proof. The proof proceeds by induction. As noted above, (2.2) is
known to be true when $k = 1$ and $k = 2$. Now write $a_k(\lambda,w)$ for the
conditional probability on the left in (2.2), and define

Then, after some calculation, it readily follows from (2.1) that
$a_k(\lambda,w) = h_k(\lambda,-w)$. If we now follow the formulation of Goodman (1976)
of treating the $(k+1)$-parameter, real valued, $W(x)$ as a $C_0[0,1]^k$
valued, single parameter process, then by applying Goodman's Theorem 2
and mimicking his manipulations on page 980, it is straightforward to
establish the relation

«k+i<\w)  >  /  (1-e  (w-u) )  h^ ( X,  du)  . 

Exploiting the above relationship between $a_k$ and $h_k$ thus yields
a recurrence formula for $a_{k+1}$, and it is now a matter of elementary
calculus to check the induction hypothesis and so complete the proof.

The second result which, unlike Theorem 2.1, is of little
independent interest, will be extremely useful for us later. To state
it, we introduce, for each $\varepsilon \in (0,\tfrac12)$, the d.f. $F_\varepsilon(x) = F_\varepsilon(x_1, \ldots, x_k)$
defined by

(2.5)  $F_\varepsilon(x) = \begin{cases} \min_i(x_i) & \text{if } |\min_i(x_i) - \tfrac12| > \varepsilon, \\ \tfrac12 - \varepsilon + (2\varepsilon)^{1-k}\prod_{i=1}^{k}[(x_i - \tfrac12 + \varepsilon) \wedge 2\varepsilon] & \text{otherwise.} \end{cases}$

Despite its somewhat forbidding appearance, $F_\varepsilon$ is a rather simple
d.f., distributing total probability $2\varepsilon$ uniformly on the cube
$A_\varepsilon := [\tfrac12-\varepsilon, \tfrac12+\varepsilon]^k$, with the remaining probability of $1-2\varepsilon$ distributed
uniformly on that part of the main diagonal of $I^k$ disjoint from $A_\varepsilon$.
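The verbal description of $F_\varepsilon$ can be turned into a short sketch. The code below assumes the form of (2.5) implied by that description: mass $2\varepsilon$ spread uniformly over the cube $A_\varepsilon$, so that the cube term carries the normalising factor $(2\varepsilon)^{1-k}$, and the rest of the mass on the diagonal. The function name `f_eps` is hypothetical.

```python
def f_eps(x, eps):
    """Evaluate the d.f. F_eps of (2.5) at a point x in the unit cube.

    Assumes 0 < eps < 1/2. Outside the slab |min_i x_i - 1/2| <= eps only
    the diagonal mass matters and F_eps(x) = min_i x_i. Inside, the mass
    1/2 - eps below the cube is added to the uniform mass 2*eps on the
    cube A_eps that lies in [0, x].
    """
    k = len(x)
    m = min(x)
    if abs(m - 0.5) > eps:
        return m
    prod = 1.0
    for xi in x:
        # Side length of [1/2 - eps, x_i] clipped to the cube, in [0, 2*eps].
        prod *= min(xi - 0.5 + eps, 2.0 * eps)
    return 0.5 - eps + (2.0 * eps) ** (1 - k) * prod
```

As a check on the construction, at the centre of the cube with $k = 3$ and $\varepsilon = 0.1$ the value is $0.4 + 0.2/8 = 0.425$, and the marginals are uniform, for example $F_\varepsilon(0.55, 1, 1) = 0.55$.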

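Now let $\psi_\varepsilon(u,v)$ denote the two-dimensional normal density
with zero means and covariance matrix $\Sigma_\varepsilon$ defined by

(2.6)  $\Sigma_\varepsilon = \begin{pmatrix} (\tfrac12-\varepsilon)(\tfrac12+\varepsilon) & (\tfrac12-\varepsilon)^2 \\ (\tfrac12-\varepsilon)^2 & (\tfrac12-\varepsilon)(\tfrac12+\varepsilon) \end{pmatrix}$.

Furthermore, let $\tilde{\Sigma}_\varepsilon$ be the matrix identical to $\Sigma_\varepsilon$, but with the
sign of the off-diagonal entries reversed, and let $\tilde{\psi}_\varepsilon$ be the
two-dimensional normal density with zero mean and covariance matrix $\tilde{\Sigma}_\varepsilon$.
For each positive $\varepsilon$ and $\lambda$, and integral $k$, set

(2.7)  $\Psi_{k,\varepsilon}(\lambda) := \int_{-\infty}^{2\varepsilon\lambda}\int_{-\infty}^{2\varepsilon\lambda}\Big\{\sum_{n=0}^{k-1}[(u-2\varepsilon\lambda)(v-2\varepsilon\lambda)/\varepsilon]^n/n!\Big\}\,\tilde{\psi}_\varepsilon(u,v)\,du\,dv$,

and write $Q_\varepsilon(\lambda)$ for the normal quadrant integral

(2.8)  $Q_\varepsilon(\lambda) := 1 - \int_{-\infty}^{\lambda}\int_{-\infty}^{\lambda}\psi_\varepsilon(u,v)\,du\,dv$.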


We  can  now  state 


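Theorem 2.2. For every $\lambda > 0$

(2.9)  $P\{\sup_{I^k} W_{F_\varepsilon}(x) > \lambda\} \ge Q_\varepsilon(\lambda) + e^{-2\lambda^2}\Psi_{k,\varepsilon}(\lambda)$.

In particular, there is a finite $c = c(\varepsilon,k)$ such that for all $\lambda > 0$

(2.10)  $P\{\sup_{I^k} W_{F_\varepsilon}(x) > \lambda\} \ge c\,\lambda^{2(k-1)}e^{-2\lambda^2}$.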

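Proof. We shall obtain only a lower bound for $P\{\sup(W_{F_\varepsilon}(x) : x \in A_\varepsilon) > \lambda\}$,
which, a fortiori, will provide the lower bound required. Let $a_\varepsilon$ and $b_\varepsilon$
be the two extreme corners of $A_\varepsilon$, i.e., $a_\varepsilon = (\tfrac12-\varepsilon, \ldots, \tfrac12-\varepsilon)$,
$b_\varepsilon = (\tfrac12+\varepsilon, \ldots, \tfrac12+\varepsilon)$. Then define the process $Z(x)$ on $I^k$ by

$Z(x) := (2\varepsilon)^{-1/2}\{W_{F_\varepsilon}(a_\varepsilon + 2\varepsilon x) - (1-|x|)W_{F_\varepsilon}(a_\varepsilon) - |x|W_{F_\varepsilon}(b_\varepsilon)\}$.

Then it is straightforward to check that $Z(x)$ is a standard pinned
sheet on $I^k$, as in (2.1). Consequently, for $(u,v) < \lambda$, it follows
that

(2.11)  $P\{\sup_{A_\varepsilon} W_{F_\varepsilon}(x) > \lambda \mid W_{F_\varepsilon}(a_\varepsilon) = u,\ W_{F_\varepsilon}(b_\varepsilon) = v\}$
        $= P\{\sup_{I^k}[Z(x) + |x|(v-u)/\sqrt{2\varepsilon}] > (\lambda-u)/\sqrt{2\varepsilon}\}$.

But this is precisely the probability defined at (2.4). Thus, using
the equivalence noted there between this probability and $a_k$, we
can bound it by Theorem 2.1. Using this bound, (2.11), and the fact
that the joint density of $(W_{F_\varepsilon}(a_\varepsilon), W_{F_\varepsilon}(b_\varepsilon))$ is given by $\psi_\varepsilon$,
we obtain

(2.12)  $P\{\sup_{I^k} W_{F_\varepsilon}(x) > \lambda\} \ge P\{W_{F_\varepsilon}(a_\varepsilon) > \lambda \text{ or } W_{F_\varepsilon}(b_\varepsilon) > \lambda\}$
        $+ \int_{-\infty}^{\lambda}\int_{-\infty}^{\lambda} e^{-(\lambda-u)(\lambda-v)/\varepsilon}\Big\{\sum_{n=0}^{k-1}[(\lambda-u)(\lambda-v)/\varepsilon]^n/n!\Big\}\,\psi_\varepsilon(u,v)\,du\,dv$.

Consider the integrand, and make the transformations $x = u - \lambda(1-2\varepsilon)$,
$y = v - \lambda(1-2\varepsilon)$. Tedious but straightforward algebra yields that it is
equivalent to

$e^{-2\lambda^2}\,\tilde{\psi}_\varepsilon(x,y)\sum_{n=0}^{k-1}[(x-2\varepsilon\lambda)(y-2\varepsilon\lambda)/\varepsilon]^n/n!$.

Substituting this into (2.12), changing the bounds on the integral,
and replacing the rightmost probability by $Q_\varepsilon(\lambda)$ now yields (2.9),
as required.

To obtain (2.10) from (2.9) simply take $\lambda$ large enough so that
the dominant term in the sum in $\Psi_{k,\varepsilon}(\lambda)$ is $O(\lambda^{2(k-1)})$. Then choose
an appropriate $c$ to make (2.10) work. This completes the proof.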

In what follows we shall be primarily interested in the asymptotic
lower bound (2.10), which will be used to prove results of theoretical
interest. The explicit expression (2.9) has, however, some practical
value  for  statistical  hypothesis  testing,  and  this  is  discussed  in 
Brown  and  Adler  (1984) ,  where  the  bound  is  actually  tabulated  for  a 
number  of  cases. 

In  general,  we  shall  use  Theorem  2.2  to  form  a  basis  for  comparison 
between  the  maxima  of  pinned  sheets  based  on  different  d.f.s.  The 
crucial  result  that  underlies  all  these  comparisons  is  a  basic  result 
of Slepian (1962), which we record here as

Slepian's inequality. Let $X(t)$ and $Y(t)$ be zero mean Gaussian
processes defined over some set $T$. If $\operatorname{var} X(t) = \operatorname{var} Y(t)$,
$\forall t \in T$, and

(2.13)  $\operatorname{cov}(X(t), X(s)) \le \operatorname{cov}(Y(t), Y(s))$  $\forall s, t \in T$,

then

(2.14)  $P\{\sup_T X(t) > \lambda\} \ge P\{\sup_T Y(t) > \lambda\}$  $\forall \lambda$.

Note that Slepian's inequality does not extend to comparisons of
$\sup|X|$ and $\sup|Y|$, and so the sharp results of the following section
are not easily extendable to the two-sided KS statistic. Nevertheless
we can always use the fact that for symmetric processes

(2.15)  $P\{\sup X > \lambda\} \le P\{\sup|X| > \lambda\} \le 2P\{\sup X > \lambda\}$

to obtain bounds for the two-sided case. For the bounds of Section 4,
in which constants are not identified, this is clearly sufficient.

We  now  consider,  as  an  example  of  our  "comparison  methodology" 
the  two-dimensional  case. 


3.  THE  TWO-DIMENSIONAL  CASE


Throughout this section, we shall denote points in $I^2$ by
$(x,y)$, and $F$ will denote a continuous d.f. on $I^2$ possessing uniform
marginals. The degenerate distribution, uniform on the negative
slope diagonal $x+y=1$ will be denoted by $G(x,y)$; i.e.,

(3.1)  $G(x,y) = (x+y-1)^{+}$,  $(x,y) \in I^2$.

Our aim in this section will be to devise good (non-asymptotic)
bounds for $P\{\sup W_F > \lambda\}$. We start with

Theorem 3.1 For any two-dimensional d.f. $F$ satisfying the above
conditions, and for any $\lambda > 0$

(3.2)  $P\{\sup_{I^2} W_F(x) > \lambda\} \le P\{\sup_{I^2} W_G(x) > \lambda\}$.

Furthermore

(3.3)  $P\{\sup_{I^2} W_F(x) > \lambda\} \le \sum_{n=1}^{\infty}(8n^2\lambda^2 - 2)\,e^{-2n^2\lambda^2}$.

Proof. Let $m$ be the mapping from $I^2$ into $I^2$ defined by

(3.4)  $G(m(x)) = G(m_1(x), m_2(x)) = F(x)$,  $\forall x \in I^2$,

(3.5)  $m_2(x) - m_1(x) = x_2 - x_1$,  $\forall x \in I^2$.

We must check that $m$ is well defined. For given $x$, note that $m(x)$
lies on the line $\ell_x := \{(x_1+\mu, x_2+\mu),\ \mu \in \mathbb{R}\}$, and $G(x_1+\mu, x_2+\mu)$
is clearly non-decreasing in $\mu$. Indeed, (3.1) and a little elementary
geometry show that $G(x_1+\mu, x_2+\mu)$ is strictly increasing for

(3.6)  $\tfrac12(1-x_1-x_2) < \mu < 1 - (x_1 \vee x_2)$.

When $\mu = \tfrac12(1-x_1-x_2)$, then $G(x_1+\mu, x_2+\mu) = 0 \le F(x)$. When $\mu = 1 - (x_1 \vee x_2)$,
then $G(x_1+\mu, x_2+\mu) = 1 - |x_1 - x_2|$. Suppose, w.l.o.g., that $x_1 \ge x_2$.
Then $(x_1+\mu, x_2+\mu) = (1, x_2 + (1-x_1))$. Applying these facts,
together with the uniformity of the marginals of $F$ and the natural
monotonicity of $F$, we obtain

$G(x_1+\mu, x_2+\mu) = F(1, x_2 + (1-x_1)) \ge F(x)$.

Thus, within the range (3.6) there is, by the continuity and strict
monotonicity of $G$, exactly one $\mu$ satisfying $G(x_1+\mu, x_2+\mu) = F(x)$.
Hence the map $m$ is well defined.

Now consider the processes $W_F$ and $W_G$. We shall compare
$\sup\{W_F(x) : x \in I^2\}$ to $\sup\{W_G(x) : x \in m(I^2)\} \le \sup\{W_G(x) : x \in I^2\}$. Note
firstly that for $x \in I^2$

(3.7)  $\operatorname{var} W_F(x) = \operatorname{var} W_G(m(x))$,

a simple consequence of (3.4) and (1.8). Consider

(3.8)  $\operatorname{cov}(W_F(x), W_F(y)) = F(x \wedge y) - F(x)F(y)$.

Suppose $x \wedge y = x$. Then

$F(x \wedge y) = F(x) = G(m(x)) \ge G(m(x) \wedge m(y))$.

Thus, in this case (as in the analogous case $x \wedge y = y$)

(3.9)  $\operatorname{cov}(W_F(x), W_F(y)) \ge \operatorname{cov}(W_G(m(x)), W_G(m(y)))$.

If we can also establish (3.9) in general, then we shall have completed
the proof of the first part of the theorem, viz. (3.2), since (3.7)
and (3.9) are precisely the ingredients for Slepian's inequality.

Thus, consider (3.8) for $x, y$ with $x_1 > y_1$ and $x_2 < y_2$. (The
remaining case is handled analogously.) Then $x \wedge y = (y_1, x_2)$. Write
$w = (m_1(y), m_2(x))$. There are three possible cases to consider:
$m(x) \ge w \ge m(y)$, $m(y) \ge w \ge m(x)$, $w = m(x) \wedge m(y)$. We shall consider
only the third case explicitly, but the reasoning is valid for all the
cases. Note (drawing a picture helps to see the inequalities) that

$F(x \wedge y) = F(y_1, x_2)$
  $\ge [F(x) - (x_1 - y_1)] \vee [F(y) - (y_2 - x_2)]$    by marginal uniformity
  $\ge \tfrac12\{F(x) + F(y) - [(x_1 - x_2) - (y_1 - y_2)]\}$
  $= \tfrac12\{G(m(x)) + G(m(y)) - [(m_1(x) - m_2(x)) - (m_1(y) - m_2(y))]\}$    by (3.4), (3.5)
  $\ge m_2(x) + m_1(y) - 1$    by (3.1).

Hence, if $m_2(x) + m_1(y) - 1 \ge 0$ then the above yields

(3.10)  $F(x \wedge y) \ge G(m_1(y), m_2(x))$.

On the other hand, if $m_2(x) + m_1(y) - 1 < 0$, then $G(m_1(y), m_2(x)) = 0$
and so (3.10) is trivially true. Thus, in general,

$F(x \wedge y) \ge G(m_1(y), m_2(x)) = G(w) = G(m(x) \wedge m(y))$.

From this we immediately obtain (3.9) and the proof of (3.2).
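The comparison argument can be checked numerically for a particular $F$. The sketch below takes $F$ uniform on the unit square, $F(x_1,x_2) = x_1x_2$, which is an assumption made purely for illustration, and uses the fact that solving (3.4) along the line $(x_1+\mu, x_2+\mu)$ gives $\mu = \tfrac12(F(x) + 1 - x_1 - x_2)$. All function names are hypothetical.

```python
def G(u, v):
    """The extremal d.f. (3.1), uniform on the diagonal u + v = 1."""
    return max(u + v - 1.0, 0.0)

def m(x, F):
    """The map of (3.4)-(3.5): shift x along (1,1) until G equals F(x)."""
    x1, x2 = x
    mu = 0.5 * (F(x1, x2) + 1.0 - x1 - x2)
    return (x1 + mu, x2 + mu)

def cov(D, p, q):
    """Covariance (1.8) of the pinned sheet based on the d.f. D."""
    meet = (min(p[0], q[0]), min(p[1], q[1]))
    return D(*meet) - D(*p) * D(*q)

def min_covariance_gap(F, points):
    """Smallest value of cov_F(x,y) - cov_G(m(x),m(y)) over all pairs;
    inequality (3.9) says this should be non-negative."""
    images = [m(p, F) for p in points]
    return min(cov(F, p, q) - cov(G, mp, mq)
               for p, mp in zip(points, images)
               for q, mq in zip(points, images))
```

On an 11 by 11 grid with the uniform $F$, the gap comes out non-negative up to rounding, in line with (3.9); equality occurs, for example, whenever one of the two points is $(1,1)$.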

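It remains to establish the inequality (3.3). To this end, let
$\mathring{W}(t)$, $t \in [0,1]$, be a standard Brownian bridge with covariance function

(3.11)  $E\{\mathring{W}(t)\mathring{W}(s)\} = (s \wedge t) - st$.

Then the process equal to $\mathring{W}(x_1) - \mathring{W}(1-x_2)$ when $x_1 + x_2 - 1 \ge 0$,
and to $0$ when $x_1 + x_2 - 1 < 0$, is a version of $W_G$. Thus

(3.12)  $P\{\sup W_G(x_1,x_2) > \lambda\} = P\{\sup[\mathring{W}(x_1) - \mathring{W}(1-x_2) : x_1 + x_2 - 1 \ge 0] > \lambda\}$
        $= P\{\sup[\mathring{W}(s) - \mathring{W}(t) : s \ge t] > \lambda\}$
        $\le P\{\sup[\mathring{W}(s) - \mathring{W}(t) : s, t \in [0,1]] > \lambda\}$
        $\le P\{\sup_{[0,1]}(\mathring{W}(s))^{+} + \sup_{[0,1]}(\mathring{W}(s))^{-} > \lambda\}$.

But the last probability is known exactly, having been determined in
Kac, Kiefer and Wolfowitz (1955, equation (4.6)), and is precisely the
sum given on the right of (3.3), and so we are finished.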


Remark. Note that the two inequalities following (3.12) are far from
sharp, and a little reflection shows that each inequality, while
retaining a bound of the right order of magnitude, "costs", roughly,
a factor of two, i.e., we expect that the final upper bound is too
large by a factor of four. Indeed, comparison of the general upper
bound (3.3) with the specific lower bound in the uniform case, (2.3)
with $k = 2$, shows, for large $\lambda$, a difference between the bounds of
precisely a factor of four. Clearly, a much better upper bound than
(3.3) is given by $P\{\sup[\mathring{W}(s) - \mathring{W}(t) : 0 \le t < s \le 1] > \lambda\}$, (c.f.
(3.12)), but this seems hard to calculate. However, numerical estimates
of this probability are easy to obtain via simulation, and some are
listed in Brown and Adler (1984). Furthermore, calculation of (3.3)
and comparison with (2.3) for moderate $\lambda$, say $\lambda \in [1,3]$,
yields that (3.3) overestimates the true probability by less than
a factor of four, and that the KS test statistics derivable from
(3.3) are in fact quite useful. For details see Brown and Adler (1984).
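The factor-of-four comparison in the Remark can be reproduced with a few lines of arithmetic. The sketch below assumes the series in (3.3) is $\sum_{n \ge 1}(8n^2\lambda^2 - 2)e^{-2n^2\lambda^2}$ and the lower bound (2.3) with $k = 2$ is $(1 + 2\lambda^2)e^{-2\lambda^2}$; the function names and the truncation at 100 terms are choices made here for illustration, adequate for moderate and large $\lambda$.

```python
import math

def upper_bound_33(lam, terms=100):
    """Truncated series on the right of (3.3)."""
    return sum((8.0 * n * n * lam * lam - 2.0) * math.exp(-2.0 * n * n * lam * lam)
               for n in range(1, terms + 1))

def lower_bound_23(lam):
    """Right-hand side of (2.3) with k = 2."""
    return (1.0 + 2.0 * lam * lam) * math.exp(-2.0 * lam * lam)

def bound_ratio(lam):
    """Ratio of the general upper bound to the uniform-case lower bound."""
    return upper_bound_33(lam) / lower_bound_23(lam)
```

For large $\lambda$ the first term of the series dominates, so the ratio behaves like $(8\lambda^2 - 2)/(1 + 2\lambda^2)$, which increases to 4; for $\lambda \in [1,3]$ it stays strictly below 4.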

We now turn to the more difficult problem of finding a uniform
lower bound for the two-dimensional case. Here we shall need to
impose assumptions on $F$ in order to avoid degeneracies, (e.g.,
$F$ concentrated on the diagonal $x_1 = x_2$, which reduces to the
one-dimensional case.) Let $\|x\| = |x_1| + |x_2|$ denote the "city block"
norm of $x$. Then we shall prove

Theorem 3.2 Let $F$ be a d.f. on $I^2$, with uniform marginals, such
that there exists an $x_0 \in I^2$, a neighbourhood $N$ of $x_0$, and a
constant $\beta \in (0,1]$ satisfying

(3.13)  $F(x_0) = \tfrac12$,

(3.14)  If $x, y \in N$ and either $x_1 = y_1$ or $x_2 = y_2$ then $|F(x) - F(y)| \ge \beta\|x - y\|$.

Then there exists a finite $c = c(F) > 0$ such that

(3.15)  $P\{\sup_{I^2} W_F(x) > \lambda\} \ge c\,\lambda^2 e^{-2\lambda^2}$.


Remarks. Theorem 3.2, as it stands, is a special case of the more general
result Theorem 4.2. What makes it of special interest, however, is
the  fact  that  in  two  dimensions  it  is  possible  to  obtain  estimates  for 
c.  We  shall  discuss  these  at  the  end  of  the  proof.  Furthermore,  the 
two  dimensional  case  turns  out  to  be  somewhat  simpler  than  its  higher 
dimensional  analogue,  thereby  making  its  proof  more  transparent  and 
interesting. 

It  is  clear  that  the  conditions  of  Theorem  3.2  hold  if  F  has 
a  density  bounded  away  from  zero.  However,  absolute  continuity  is  not 
a  requisite  of  the  theorem,  and  it  is  easy  to  build  examples  of  non-absolutely 
continuous  F  satisfying  (3.13)  and  (3.14).  A  trivial  example  is  the 
extremal  case,  (3.1). 


Proof of Theorem 3.2. The aim of the proof will be to compare $W_F$
with $W_{F_\varepsilon}$, where $F_\varepsilon$ is the distribution function (2.5) of the
preceding section, and then use Slepian's inequality and Theorem 2.2
to complete the argument. The comparison will only be possible over
a region in the neighbourhood of $(\tfrac12,\tfrac12)$ in the domain of $W_{F_\varepsilon}$, together
with a subset of $N$ in the domain of $W_F$, but it will turn out
that such a comparison will suffice for our purposes. We start by
building a mapping between the above two neighbourhoods, and by
noting that the reader's path through the forthcoming algebra will
be considerably simpler if he follows the argument graphically with
pen and paper.

For $x \in \mathbb{R}^2$, let $\hat{x}$ be the projection of $x$ on the diagonal
$\{x : x_1 = x_2\}$, i.e., $\hat{x}$ has both coordinates equal to $\tfrac12(x_1 + x_2)$. Define

(3.16)  $d = d(N) := \inf\{\|x - x_0\| : x \notin N\}$.

Let $\varepsilon = d\beta/3$ and define a map $m$ from $A_\varepsilon$ into $N$ satisfying

(3.17)  $m(y) - \hat{m}(y) = x_0 - \hat{x}_0 + (y - \hat{y})/\beta$

and

(3.18)  $F_\varepsilon(y) = F(m(y))$,

where $F_\varepsilon$ is defined at (2.5). It is necessary to demonstrate that
this map is well defined and one-one.

To this end, fix $y \in A_\varepsilon$ and let $m_0 = x_0 + (y - \hat{y})/\beta$. Also,
let $m_\mu = m_0 + \mu(1,1)$, for real $\mu$. Now note that, by (3.14), $F(m_\mu)$
is strictly increasing in $\mu$ as long as $m_\mu \in N$. Furthermore, if
$\mu = \varepsilon/\beta$, then $m_\mu \ge x_0$, since $|(y - \hat{y})_i| < \varepsilon$ for $y \in A_\varepsilon$. Similarly,
$\mu = -\varepsilon/\beta$ implies $m_\mu \le x_0$. Consequently

(3.19)  $F(m_{-\varepsilon/\beta}) \le \tfrac12 \le F(m_{\varepsilon/\beta})$.

Now consider for what values of $\mu$ we shall have $m_\mu \in N$.
Clearly $|(m_\mu)_i - (x_0)_i| < \varepsilon/\beta + |\mu|$, $i = 1, 2$, since $y \in A_\varepsilon$. Hence $m_\mu \in N$
for $|\mu| < 2\varepsilon/\beta$. Now take $\mu > \mu'$, with $|\mu| \vee |\mu'| < 2\varepsilon/\beta$. Then, by
(3.14),

$F(m_\mu) - F(m_{\mu'}) \ge \beta(\mu - \mu')$.

But now it follows that there is a unique $\mu \in [-2\varepsilon/\beta, 2\varepsilon/\beta]$ such that
$F(m_\mu) = F_\varepsilon(y)$, since $y \in A_\varepsilon$ implies $\tfrac12 - \varepsilon \le F_\varepsilon(y) \le \tfrac12 + \varepsilon$. Let
$m(y) = m_\mu$ for this $\mu$. Then clearly (3.18) is satisfied, as is (3.17),
so that $m$ is well defined for each $y \in A_\varepsilon$. Furthermore, the above
argument also establishes that $m$ is one-one. This completes the
first part of the argument.

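Let $m(A_\varepsilon)$ be the image of $A_\varepsilon$ under the mapping $m$, and
consider $W_F(x)$ for $x \in m(A_\varepsilon)$. Clearly, for $y \in A_\varepsilon$, (3.18) and (1.8)
imply

(3.20)  $E\{W_F^2(m(y))\} = E\{W_{F_\varepsilon}^2(y)\}$.

Now take $y_1 < y_2 \in A_\varepsilon$. Then $F_\varepsilon(y_1) < F_\varepsilon(y_2)$ and so
$F(m(y_1)) < F(m(y_2))$. Thus (1.8) immediately yields

(3.21)  $E\{W_F(m(y_1))W_F(m(y_2))\} = F(m(y_1) \wedge m(y_2)) - F(m(y_1))F(m(y_2))$
        $\le F(m(y_1)) \wedge F(m(y_2)) - F(m(y_1))F(m(y_2))$
        $= F_\varepsilon(y_1) - F_\varepsilon(y_1)F_\varepsilon(y_2)$
        $= E\{W_{F_\varepsilon}(y_1)W_{F_\varepsilon}(y_2)\}$.

By symmetry, (3.21) also holds for $y_1 > y_2$. Now suppose $y_3 = y_1 \wedge y_2$ is
distinct from both $y_1$ and $y_2$. Set $x_i = m(y_i)$, $i = 1, 2$, and
$x_3 = x_1 \wedge x_2$. Observe, either geometrically or algebraically, that

(3.22)  $\|y_1 - y_3\| + \|y_2 - y_3\| = \|(y_1 - \hat{y}_1) - (y_2 - \hat{y}_2)\|$.

Thus, since $F_\varepsilon$ has uniform marginals, and $y_i \in A_\varepsilon$,

(3.23)  $F_\varepsilon(y_1) + F_\varepsilon(y_2) - 2F_\varepsilon(y_3) \le (\tfrac12 + \varepsilon)(\|y_1 - y_3\| + \|y_2 - y_3\|)$
        $= (\tfrac12 + \varepsilon)\|(y_1 - \hat{y}_1) - (y_2 - \hat{y}_2)\|$.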

Now suppose that $x_2 \ge x_1$. We shall show that this is impossible.
Write $x_4 = ((x_1)_1, (x_2)_2)$. Then, by geometry and assumption (3.14)

$F(x_2) - F(x_1) = F(x_2) - F(x_4) + F(x_4) - F(x_1)$
  $\ge \beta\{\|x_2 - x_4\| + \|x_4 - x_1\|\}$
  $\ge \beta\|(x_1 - \hat{x}_1) - (x_2 - \hat{x}_2)\|$
  $= \|(y_1 - \hat{y}_1) - (y_2 - \hat{y}_2)\|$,

the last line following from (3.17). The above and (3.23) now yield

$0 < F(x_2) - F(x_1) = F_\varepsilon(y_2) - F_\varepsilon(y_1)$
  $\le F_\varepsilon(y_2) + F_\varepsilon(y_1) - 2F_\varepsilon(y_3)$
  $\le (\tfrac12 + \varepsilon)\{F(x_2) - F(x_1)\}$,

which, since $\varepsilon < \tfrac12$, is clearly untenable. Thus we cannot have $x_2 \ge x_1$
nor, by symmetry, $x_1 \ge x_2$. Consequently, $x_3 = x_1 \wedge x_2$ is distinct
from both $x_1$ and $x_2$. Then, again by geometry, assumption, and
(3.22), we have

$F(x_1) + F(x_2) - 2F(x_3) \ge \beta\{\|x_1 - x_3\| + \|x_2 - x_3\|\}$
  $= \beta\|(x_1 - \hat{x}_1) - (x_2 - \hat{x}_2)\|$
  $= \|(y_1 - \hat{y}_1) - (y_2 - \hat{y}_2)\|$
  $= \|y_1 - y_3\| + \|y_2 - y_3\|$
  $\ge (\tfrac12 + \varepsilon)^{-1}\{F_\varepsilon(y_1) + F_\varepsilon(y_2) - 2F_\varepsilon(y_3)\}$
  $> F_\varepsilon(y_1) + F_\varepsilon(y_2) - 2F_\varepsilon(y_3)$.

Thus, since $F(x_i) = F_\varepsilon(y_i)$, $i = 1, 2$,

$F_\varepsilon(y_3) = F(m(y_3)) > F(x_3)$.

From this it immediately follows that for all $y_1, y_2 \in A_\varepsilon$

$E\{W_{F_\varepsilon}(y_1)W_{F_\varepsilon}(y_2)\} \ge E\{W_F(m(y_1))W_F(m(y_2))\}$,

with strict inequality if $x_1 \wedge x_2 \ne x_i$, $i = 1, 2$. But this is all we
need, for by Slepian's inequality,

$P\{\sup_{I^2} W_F(x) > \lambda\} \ge P\{\sup_{m(A_\varepsilon)} W_F(x) > \lambda\}$
  $\ge P\{\sup_{A_\varepsilon} W_{F_\varepsilon}(x) > \lambda\}$.

The last probability is precisely that given by the RHS of (2.9),
which, as we have already noted, is asymptotically of the form
$c\,\lambda^2 e^{-2\lambda^2}$. This completes the proof of the theorem.

We close this section with two remarks. The first is on the constant
$c$ of Theorem 3.2, or, to be more precise, on an exact lower bound for
$P\{\sup W_F(x) > \lambda\}$. It is clear from the argument that such a bound
is given by $Q_\varepsilon(\lambda) + e^{-2\lambda^2}\Psi_{2,\varepsilon}(\lambda)$, with $\varepsilon = d\beta/3$. (c.f. (2.7), (2.8),
(3.16).) If we consider the case of $F$ uniform, the optimal choice
of $d$, $\beta$, so as to maximise $\varepsilon$, is $d = \beta = 1/(2\sqrt{2})$, yielding $\varepsilon \approx 0.04$.
This is, of course, much smaller than the $\varepsilon = \tfrac12$ that a sharp argument
would give. Nevertheless, the numerical consequences of this lack
of sharpness are not quite as bad as one might imagine. For details,
see Brown and Adler (1984).


It is interesting to note that there are "$1\tfrac12$-dimensional" d.f.'s
that yield supremum tail probabilities strictly between the
one-dimensional $O(e^{-2\lambda^2})$ and two-dimensional $O(\lambda^2 e^{-2\lambda^2})$. As an example
take $H$ to be the d.f. on $I^2$ with density

(3.24)  $h(x,y) = \begin{cases} 2 & (x,y) \le (\tfrac12,\tfrac12) \text{ or } (x,y) \ge (\tfrac12,\tfrac12), \\ 0 & \text{otherwise.} \end{cases}$

Clearly, $H$ fails to satisfy the conditions of Theorem 3.2. However,
it is a relatively easy exercise to estimate the exceedence probabilities
of $W_H$, using the fact that the two processes


^(x.y):  =  /2(Wh(x/2,  y/2)  -  W^Ps.^)} 

W2(x,y);  =  ^{Wh(1-x/2,  l-x/2)  -  }  , 

2  o(2) 

(x,y)  e I  ,  are  both  versions  of  the  pinned  Brownian  sheet  W  . 

This fact, together with Theorems 2.1 and 3.2, conditioning on and
then integrating out $W_H(\tfrac12,\tfrac12)$, readily yields

(3.25)    $P\{\sup_{I^2} W_H(x) > \lambda\} = O(\lambda e^{-2\lambda^2}),$

thus indicating that non-even powers of $\lambda$ in tail bounds cannot be
excluded. (Indeed, there is no good reason even to exclude non-integer
powers, as these do occur as tail bounds for other classes of Gaussian
processes; see, for example, Section 12.2 of Leadbetter, Lindgren and
Rootzen (1983).)


4.  BOUNDS FOR THE GENERAL CASE


Our aim in this section will be to obtain, in $k > 2$ dimensions,
bounds of the same general form as those we have just obtained for
two dimensions. In particular, if F is a continuous d.f. on $I^k$
with uniform (one-dimensional) marginals, then the two central results
are as follows:

Theorem 4.1  There exist constants $c_k$, $k \ge 1$, independent of F and $\lambda$,
such that for F as above

(4.1)    $P\{\sup_{I^k} W_F(x) > \lambda\} \le c_k \lambda^{2(k-1)} e^{-2\lambda^2}.$

Theorem 4.2  Suppose, in addition to the above, there exists an $x_0 \in I^k$,
a neighbourhood N of $x_0$, and a constant $\eta > 0$ satisfying

(4.2)    $F(x_0) = \tfrac12$,

(4.3)    Throughout N, F possesses continuous first order partial
         derivatives $\psi_i := \partial F/\partial x_i$ satisfying

         $\inf_{x \in N}\,\inf_{1 \le i \le k} \psi_i(x) \ge \eta > 0.$

Then for each such F there exists a constant $c = c(F)$, independent
of $\lambda$, such that

(4.4)    $P\{\sup_{I^k} W_F(x) > \lambda\} \ge c\lambda^{2(k-1)} e^{-2\lambda^2}.$

Both of these results, while clearly indicating the correct order
of magnitude behaviour of the tail of $\sup W_F$, are considerably weaker
than their two-dimensional counterparts, since the style of their
proofs will be such that it will be impossible to closely monitor
inequalities so as to estimate the constants of the bounds.
Consequently, the statistical value of Theorems 4.1 and 4.2 is somewhat
limited. Nevertheless, they have interesting probabilistic
consequences, as we shall see in Section 5, as well as being of intrinsic
interest for the reasons mentioned in the introduction.

We shall prove Theorem 4.1 first, by a method totally different
from that used for the two-dimensional upper bound. There, recall,
the argument was based on finding a "worst possible F". In dimensions
three and above there seems to be no analogous unique worst F, and
the proof is forced to take a different route. We start with some
necessary lemmas, for which we define the following event for
$x_1, x_2 \in I^k$, $x_1 \le x_2$:

(4.5)    $A = A(x_1,x_2,\lambda) := \{\sup(W_F(x): x_1 \le x \le x_2) > \lambda\}.$

Also, write

(4.6)    $\sigma^2(x) := \mathrm{var}(W_F(x)) = F(x)[1-F(x)].$

k 

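Lemma 4.1  Take $\tfrac14 < \alpha < \tfrac12$, $x_1, x_2 \in I^k$, $x_1 \le x_2$ and $\lambda > 1$. If

(4.7)    $\alpha \le F(x_1) \le F(x_2) \le 1-\alpha,$

and

(4.8)    $F(x_2) - F(x_1) \le \tfrac12\alpha^2\lambda^{-2},$

then

(4.9)    $P(A) \le O(\lambda^{-1}\exp(-\lambda^2/2\sigma^2(x_1))),$

where, for any function $f: R \to R$,

         $f(a) \le O(a) \iff \limsup\,(f(a)/a) \le K < \infty.$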


Proof.  Since it is generally difficult to work with the maxima of
the pinned sheet $W_F$, the main idea of the proof is to relate $W_F$ to
its unpinned version, $Z_F$, where $Z_F$ is the zero mean Gaussian field
on $I^k$ satisfying

         $E\{Z_F(x) \cdot Z_F(y)\} = F(x \wedge y).$

Then $Z_F(x) - F(x)Z_F(1)$ is a version of $W_F$, so that using this version
in all that follows, we can write

(4.10)   $W_F(x) = V(x) - [F(x) - F(x_1)]Z_F(1), \quad x \in [x_1,x_2],$

where

(4.11)   $V(x) := W_F(x_1) + [Z_F(x) - Z_F(x_1)], \quad x \in [x_1,x_2].$

The idea of the proof is that for $\lambda$ large, (4.8) implies the second term
in (4.10) will be small, while $V(x)$ will be close to $W_F(x_1)$.

Note firstly, by direct calculation of covariances, that $W_F(x)$ and
$Z_F(1)$ are independent, so that with A as at (4.5)

(4.12)   $P\{A\} = 2P\{A \text{ and } Z_F(1) > 0\}.$

Thus, by (4.10),

(4.13)   $P\{A\} \le 2P\{Z_F(1) > 0 \text{ and } \sup_{[x_1,x_2]} V(x) > \lambda\}$

         $\qquad \le 2P\{\sup_{[x_1,x_2]} V(x) > \lambda\}.$

To bound the last probability, write $V(x) = W_F(x_1) + U(x)$, where

         $U(x) := Z_F(x) - Z_F(x_1)$

is independent of $W_F(x_1)$. Suppose we can show the existence of a finite
$c > 0$ such that for all $\eta > 0$

(4.14)   $P\{\sup_{[x_1,x_2]} U(x) > \eta\} \le c(1 + \eta\lambda/\alpha)^{-1} e^{-\eta^2\lambda^2/\alpha^2}.$

Then, allowing c to vary from line to line, and setting $\sigma^2 = \sigma^2(x_1)$
for notational convenience, we have

$P\{A\} \le 2P\{W_F(x_1) > \lambda\} + \int_{-\infty}^{\lambda} P\{\sup U(x) > \lambda - w\}\,dP\{W_F(x_1) \le w\}$

$\qquad \le c\lambda^{-1}\exp(-\lambda^2/2\sigma^2) + c\int_{-\infty}^{\lambda} (1 + \lambda - w)^{-1}\exp\{-\frac{(\lambda-w)^2\lambda^2}{\alpha^2} - \frac{w^2}{2\sigma^2}\}\,dw,$

on using standard inequalities for the first probability, and (4.14) for
the integrand, after noting $\lambda > 1$ and $\alpha < 1/2$. Standard integration
yields that the integral is $O(\lambda^{-1}e^{-\lambda^2/2\sigma^2})$. This proves the lemma.

Thus all that remains is to establish (4.14).

A straightforward application of the multivariate "reflection principle"
yields

         $P\{\sup_{[x_1,x_2]} U(x) > \lambda\} \le 2^k P\{U(x_2) > \lambda\}.$

By (4.8), $\mathrm{var}\,U(x_2) \le \tfrac12\alpha^2\lambda^{-2}$, so that (4.14) now follows by standard
inequalities.

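Without much extra work we can also prove a stronger version of
the preceding lemma. Under the conditions of the lemma, we have,
for $x_1 \le x \le x_2$, that

         $\sigma^2(x) \ge \sigma^2 - 5\alpha^2/(4\lambda^2).$

Consequently

$\frac{\lambda^2}{2\sigma^2(x)} - \frac{\lambda^2}{2\sigma^2} \le \frac{\lambda^2}{2[\sigma^2 - 5\alpha^2/4\lambda^2]} - \frac{\lambda^2}{2\sigma^2}$

$\qquad = \frac{5\alpha^2/4}{2\sigma^2[\sigma^2 - 5\alpha^2/4\lambda^2]}$

$\qquad = O(1).$

Thus Lemma 4.1 immediately yields

Lemma 4.2  Under the conditions and notation of Lemma 4.1

(4.15)   $P\{A\} \le O(\lambda^{-1}\exp(-\lambda^2/2\sigma^2)),$

where now $\sigma^2 = \inf\{\sigma^2(x): x_1 \le x \le x_2\}$.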

To state the next lemma define the event

         $B = \{\sup(W_F(x): F(x) < \alpha \text{ or } F(x) > 1-\alpha) > \lambda\}.$

Lemma 4.3  Let $\alpha \in (0,\tfrac12)$, and $\beta \in (1, (4\alpha(1-\alpha))^{-1})$. Then

(4.16)   $P\{B\} \le O(\exp(-2\beta\lambda^2)).$

Proof.  This is a straightforward application of (1.4), on noting
that $F(x) < \alpha$ and $F(x) > 1-\alpha$ both imply $(2\sigma^2(x))^{-1} \ge 2[4\alpha(1-\alpha)]^{-1}$.
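
The variance inequality used in the proof of Lemma 4.3 can be checked numerically on a grid. This is a minimal sketch with hypothetical helper names (`sigma2`, `lemma_rate`, `check_rate`); it only confirms the elementary inequality $(2\sigma^2(x))^{-1} \ge 2[4\alpha(1-\alpha)]^{-1}$ for values of F outside $[\alpha, 1-\alpha]$, and does not touch the probabilistic part of the argument.

```python
def sigma2(F: float) -> float:
    """Variance F(1 - F) of the pinned process at a point where the d.f. equals F."""
    return F * (1.0 - F)


def lemma_rate(alpha: float) -> float:
    """The lower bound 2 / (4 alpha (1 - alpha)) appearing in the proof of Lemma 4.3."""
    return 2.0 / (4.0 * alpha * (1.0 - alpha))


def check_rate(alpha: float, grid: int = 1000) -> bool:
    """Check 1 / (2 sigma^2) >= lemma_rate(alpha) for F < alpha and F > 1 - alpha."""
    bound = lemma_rate(alpha)
    for i in range(1, grid):
        F_low = alpha * i / grid   # a value strictly between 0 and alpha
        F_high = 1.0 - F_low       # the mirrored value above 1 - alpha
        for F in (F_low, F_high):
            if 1.0 / (2.0 * sigma2(F)) < bound:
                return False
    return True
```

Since $4\alpha(1-\alpha) < 1$ for $\alpha < \tfrac12$, `lemma_rate(alpha)` exceeds 2, which is why a $\beta > 1$ can be chosen in (4.16).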

We now turn to the

Proof of Theorem 4.1.  The idea of the proof is as follows. Divide
$I^k$ into a large number of small cubes, and separate these cubes into
two groups. In the first group put those cubes over which $W_F$ has
small variance, and use Lemma 4.3 to show that the maximum of $W_F$
over this group is asymptotically unimportant. For the second group,
use Lemma 4.2 to bound the (distribution of the) maximum of $W_F$ over
each cube, then count how many such cubes there are, and thus obtain
a final bound.

We now spell out the proof in detail, and note that the only
real difficulty lies in finding a convenient labelling system for
the various cubes. We commence with cubes over which $W_F$ has large
variance (i.e., F close to $\tfrac12$). Fix the dimension k, choose $\alpha \in (\tfrac14,\tfrac12)$,
$\lambda > 1$, and set $\gamma = \alpha^2/(2k\lambda^2)$. Let $1 = (1,\dots,1)$, and let $x \in I^k$ be
such that also $x + \gamma 1 \in I^k$. Then the uniformity of the marginals of
F implies

(4.17)   $F(x + \gamma 1) \le F(x) + \gamma k = F(x) + \alpha^2/(2\lambda^2).$

Now consider the lattice of points of the form $\gamma(n_1,\dots,n_k)$,
where $n_i = 0,1,\dots,[\gamma^{-1}]$. Then each of these points has a unique
expression as $p + j\gamma 1$, where $p \in \pi$ and $\pi$ is the set of $\gamma(n_1,\dots,n_k)$
with $\min(n_i: 1 \le i \le k) = 0$. For each $p \in \pi$ define, inductively,

         $j_1 = j_1(p) = \max\{j: F(p + j\gamma 1) \le \alpha\},$

         $j_i = j_i(p) = \max\{j: F(p + j\gamma 1) - F(p + j_{i-1}\gamma 1) \le \alpha^2/(2\lambda^2)\}.$

Furthermore, define

         $J = J(p) = \min\{i: F(p + j_i\gamma 1) > 1 - \alpha\}.$

Note that (4.17) implies $j_{i+1} - j_i \ge 1$ for all i and p. Also, for
$1 \le i \le J$,

(4.18)   $0 < \alpha - \alpha^2/(2\lambda^2) \le F(p + j_i\gamma 1) \le 1 - \alpha + \alpha^2/(2\lambda^2) < 1.$

Now set $j^*(p) = j_{J(p)} - 1$, and define

         $S(p) = \bigcup_{k=0}^{j^*(p)} \{x: p + (j_1 + k)\gamma 1 \le x \le p + (j_1 + k + 1)\gamma 1\},$

and

         $S^*(p) = \bigcup_{i=1}^{J(p)-1} \{x: p + j_i\gamma 1 \le x \le p + j_{i+1}\gamma 1\}.$

(Drawing a picture for k=2 will undoubtedly make the following argument
appear more natural.) From the definitions of S and S* it is clear
that

         $\{x: \alpha \le F(x) \le 1 - \alpha\} \subseteq \bigcup_{p \in \pi} S(p) \subseteq \bigcup_{p \in \pi} S^*(p).$

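Thus, with A as at (4.5),

(4.18)   $P\{\sup(W_F(x): \alpha \le F(x) \le 1-\alpha) > \lambda\}$

         $\le P\{\sup(W_F(x): x \in S^*(p) \text{ for some } p \in \pi) > \lambda\}$

         $\le \sum_{p \in \pi}\sum_{i=1}^{J(p)-1} P\{A(p + j_i\gamma 1,\ p + j_{i+1}\gamma 1,\ \lambda)\}$

         $\le \sum_{p \in \pi} O\big(\sum_{i=1}^{J(p)-1} \lambda^{-1}\exp[-\lambda^2/2\sigma^2(p + j_i\gamma 1)]\big),$

the last inequality following from Lemma 4.2. Now note that for all i, p,

         $F(p + j_{i+2}\gamma 1) - F(p + j_i\gamma 1) > \alpha^2/2\lambda^2,$

and set $I = \min\{i: \tfrac12 - i\alpha^2/2\lambda^2 < \alpha\} = \min\{i: \tfrac12 + i\alpha^2/2\lambda^2 > 1 - \alpha\}$. Then
it follows that the sequence

(4.19)   $\{\sigma^2(p + j_i\gamma 1): i = 1,\dots,J(p)\}$

is dominated by the sequence

(4.20)   $\{a_0,a_0,a_0,a_0,\ \dots,\ a_i,a_i,a_i,a_i,\ \dots,\ a_I,a_I,a_I,a_I\}$

in which $a_i = \tfrac14 - (i\alpha^2/2\lambda^2)^2$, where by "domination" we mean that the
elements of (4.20) may be rearranged so that, termwise, they dominate
corresponding elements of (4.19). Furthermore, there may also be more terms
in (4.20) than in (4.19). As a consequence of this we have that

$\sum_{i=1}^{J(p)-1} \lambda^{-1}\exp[-\lambda^2/2\sigma^2(p + j_i\gamma 1)]$

$\qquad \le 4\sum_{i=1}^{I} \lambda^{-1}\exp[-\lambda^2/2(\tfrac14 - (i\alpha^2/2\lambda^2)^2)]$

$\qquad \le 4\sum_{i=1}^{I} \lambda^{-1}\exp[-2\lambda^2(1 + (i\alpha^2/\lambda^2)^2)]$

$\qquad = 4e^{-2\lambda^2}\sum_{i=1}^{I} \lambda^{-1}\exp[-2\alpha^4 i^2/\lambda^2]$

$\qquad = e^{-2\lambda^2}\,O\big(\int_0^\infty e^{-2\alpha^4 y^2}\,dy\big)$

$\qquad = O(e^{-2\lambda^2}).$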
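
The step from the sum to the integral treats $\sum_i \lambda^{-1}\exp[-2\alpha^4 i^2/\lambda^2]$ as a Riemann sum with mesh $1/\lambda$ in the variable $y = i/\lambda$. The sketch below checks numerically that this sum stays below $\int_0^\infty e^{-2\alpha^4 y^2}dy = \sqrt{\pi/2}/(2\alpha^2)$ uniformly in $\lambda$. The function names and the truncation rule are assumptions of this illustration; the sum is extended past the index I, which can only enlarge it.

```python
import math


def riemann_sum(lam: float, alpha: float) -> float:
    """Sum over i >= 1 of (1/lam) * exp(-2 alpha^4 i^2 / lam^2), truncated far out.

    The truncation point (y = 50 / alpha^2) is an arbitrary choice of this
    sketch; the neglected terms are smaller than exp(-5000).
    """
    terms = int(50.0 * lam / alpha**2) + 1
    total = 0.0
    for i in range(1, terms + 1):
        total += math.exp(-2.0 * alpha**4 * i * i / lam**2)
    return total / lam


def limit_integral(alpha: float) -> float:
    """Closed form of the integral of exp(-2 alpha^4 y^2) over y in [0, infinity)."""
    return math.sqrt(math.pi / 2.0) / (2.0 * alpha**2)
```

Because the integrand is decreasing, the sum is a right-endpoint Riemann sum and so lies between `limit_integral(alpha) - 1/lam` and `limit_integral(alpha)`, which is exactly the O(1) behaviour used in the display.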


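Note that $\pi$ has at most $k(2 + 2k\lambda^2/\alpha^2)^{k-1} = O(\lambda^{2(k-1)})$ points.
Combining this fact, the above, (4.18) and Lemma 4.3 yields

$P\{\sup_{I^k} W_F(x) > \lambda\}$

$\qquad \le P\{\sup(W_F(x): \alpha \le F(x) \le 1-\alpha) > \lambda\} + P\{\sup(W_F(x): F(x) < \alpha \text{ or } F(x) > 1-\alpha) > \lambda\}$

$\qquad = O(\lambda^{2(k-1)}e^{-2\lambda^2}) + O(e^{-2\beta\lambda^2})$

$\qquad = O(\lambda^{2(k-1)}e^{-2\lambda^2}).$

This completes the proof of Theorem 4.1.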
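
The factor $\lambda^{2(k-1)}$ in the bound comes from counting the base points of the diagonal lines, that is, the lattice points with at least one zero coordinate. The following sketch counts these points by brute force for small cases and compares the count with the closed form $(M+1)^k - M^k$ and the cruder bound $k(M+1)^{k-1}$, where M plays the role of $[\gamma^{-1}]$. The function names are hypothetical, and the brute-force enumeration is meant only for small k and M.

```python
from itertools import product


def count_base_points(M: int, k: int) -> int:
    """Count lattice points (n_1, ..., n_k) in {0, ..., M}^k with min(n_i) = 0."""
    return sum(1 for n in product(range(M + 1), repeat=k) if min(n) == 0)


def closed_form(M: int, k: int) -> int:
    """All lattice points minus those with every coordinate at least 1."""
    return (M + 1) ** k - M ** k


def crude_bound(M: int, k: int) -> int:
    """Upper bound k (M + 1)^(k - 1), of order M^(k - 1)."""
    return k * (M + 1) ** (k - 1)
```

With M of order $\lambda^2$, the crude bound is of order $\lambda^{2(k-1)}$, matching the power of $\lambda$ in Theorem 4.1.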

Proof of Theorem 4.2  Our aim here will be to attempt to mimic the proof
of the two-dimensional case, Theorem 3.2, by comparing $W_F$ to $W_{F_\varepsilon}$. However,
for $k > 2$ dimensions the mapping on which the comparison is based is
definable as a linear mapping only in an arbitrarily small neighbourhood
of the point $x_0$ of the theorem, and we shall not be able to say
anything concrete about the size of the neighbourhood, and thus, a fortiori,
anything non-asymptotic about the lower bound that we shall obtain.

The first part of the proof carefully sets up some geometrical
structures, and is totally non-probabilistic. Probability will enter only
when the groundwork is ready.

Let G be the uniform distribution on $I^k$, and $y^*$ the point
$(\tfrac12)^{1/k} 1$. Then $G(y^*) = \tfrac12$ and

(4.21)   $\gamma := \frac{\partial G}{\partial y_i}(y^*) = (\tfrac12)^{(k-1)/k}.$

Note that $\gamma$ is independent of i. In order to compare F to G, it is
convenient to consider new coordinate systems for F and G, obtained
by rotation and translation. To this end, let $\psi = (\psi_1(x^*),\dots,\psi_k(x^*))$, and
write $\gamma$ interchangeably for the constant (4.21) and the constant vector
$\gamma 1$. Define the unit vectors, with $\|\cdot\|$ now denoting the usual Euclidean norm,

         $V_1 = \psi/\|\psi\|, \qquad W_1 = \gamma/\|\gamma\|,$

and extend to two orthonormal bases $V := \{V_1,\dots,V_k\}$ and $W := \{W_1,\dots,W_k\}$
for $R^k$. Choose the origins of the new spaces to be $x^*$ and $y^*$,
respectively. Then if $v(x)$ and $w(y)$ are, respectively, the representations
of x and y in the new coordinate systems, we have

         $v(x^*) = 0, \qquad v_1(x) = \psi'(x - x^*)/\|\psi\|,$

         $w(y^*) = 0, \qquad w_1(y) = 1'(y - y^*)/\sqrt{k} = \gamma'(y - y^*)/\|\gamma\|.$
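
One concrete way to extend a given direction to an orthonormal basis, and to read off the new coordinates of a point, is Gram-Schmidt orthogonalisation against the standard basis. The text does not say how the bases V and W are completed, so the construction below is only one possible choice; the function names `orthonormal_basis` and `coords` are hypothetical.

```python
import math


def dot(a, b):
    """Euclidean inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def orthonormal_basis(first):
    """Orthonormal basis of R^k whose first vector is first / ||first||.

    Uses (modified) Gram-Schmidt on the standard basis vectors; this is an
    assumed construction, chosen for illustration only.
    """
    k = len(first)
    norm = math.sqrt(dot(first, first))
    basis = [[c / norm for c in first]]
    for j in range(k):
        if len(basis) == k:
            break
        e = [1.0 if i == j else 0.0 for i in range(k)]
        for b in basis:
            proj = dot(e, b)
            e = [ei - proj * bi for ei, bi in zip(e, b)]
        n = math.sqrt(dot(e, e))
        if n > 1e-10:  # skip candidates already in the span of the basis
            basis.append([ei / n for ei in e])
    return basis


def coords(basis, point, origin):
    """Coordinates of `point` in the system with the given basis and origin."""
    shifted = [p - o for p, o in zip(point, origin)]
    return [dot(b, shifted) for b in basis]
```

For a gradient vector `psi` and origin `x_star`, the first entry of `coords(orthonormal_basis(psi), x, x_star)` equals $\psi'(x - x^*)/\|\psi\|$, which is the formula for $v_1(x)$ above; the remaining coordinates depend on how the basis was completed.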


The d.f.'s F and G can be transferred in a natural fashion
to V and W space, respectively. Let $\tilde F$ and $\tilde G$ be the corresponding
functions, defined by

         $\tilde F(v(x)) = F(x), \qquad \tilde G(w(y)) = G(y).$

(Note that $\tilde F$ and $\tilde G$ are not necessarily d.f.'s on V and W space.)
Now define maps $\pi^V$ and $\pi^W$ from V and W space, respectively, to the
original domains of F and G, by

         $\pi^V(v(x)) = x - x^*, \qquad \pi^W(w(y)) = y - y^*.$

Thus, $\pi^V$ and $\pi^W$ transform from the coordinate systems of the V and
W spaces to systems centered at $x^*$ and $y^*$ but oriented like the
original cartesian system.

We shall need to impose on the V and W spaces concepts of
ordering inherited from the original spaces. To this end, write

         $v^{(1)} \ll v^{(2)} \iff \pi_i^V(v^{(1)}) \le \pi_i^V(v^{(2)}), \quad i = 1,\dots,k,$

         $w^{(1)} \ll w^{(2)} \iff \pi_i^W(w^{(1)}) \le \pi_i^W(w^{(2)}), \quad i = 1,\dots,k,$

and define $v^{(1)} \curlywedge v^{(2)}$ and $w^{(1)} \curlywedge w^{(2)}$ accordingly.

This  completes  the  necessary  geometrical  groundwork.  We  now  build 
the  mapping  upon  which  the  comparison  between  F  and  G  will  be  based. 
Let 


i  ixi  i  v 

ii*ii  ii*ii 


p  max{^i:i=l, . . . ,k} 
min{^:i=l, . . .  ,k}* 


3  ■  p  +  2a. 


Define the mapping $m = (m_1,\dots,m_k)$ from a neighbourhood of zero in W
space to a neighbourhood of zero in V space, via its coordinate mappings
by firstly setting

(4.22)   $m_i(w) = \beta w_i, \quad i = 2,\dots,k,$

and then choosing $m_1(w)$ such that

(4.23)   $\tilde F(m(w)) = \tilde G(w).$

We need to check that $m_1$ is, in fact, well defined. For $w_i = 0$,
$i = 2,\dots,k$, and general $w_1$, $\tilde G(w)$ is clearly strictly increasing as
a function of $w_1$. Furthermore, since the unit vector $V_1$ of the V
space has, as a vector in the original space, strictly positive coordinates,
it follows that $\tilde F(v_1,0,\dots,0)$ is strictly increasing as a function of
$v_1$. Since $\tilde F(0) = \tilde G(0) = \tfrac12$, it follows that $m_1$ is well defined for w
of the form $(w_1,0,\dots,0)$. The implicit function theorem now defines
$m_1$ uniquely for sufficiently small neighbourhoods.

Having defined our mapping, let us consider some of its properties.
Note that for small neighbourhoods of the origin

         $\tilde G(w) = \tfrac12 + \|\gamma\| w_1 + o(\|w\|),$

         $\tilde F(v) = \tfrac12 + \|\psi\| v_1 + o(\|v\|).$

Combining these facts with (4.22), (4.23), and the definition of $\rho$
we obtain that for small w

(4.24)   $m_1(w) = \rho w_1 + o(\|w\|).$

Consequently, for small $w^i$,

(4.25)   $|[m_1(w^1) - m_1(w^2)] - [\rho w_1^1 - \rho w_1^2]| \le \|w^1 - w^2\|.$

Now let q be the linear map approximating m; i.e., set

(4.26)   $q_1(w) = \rho w_1, \qquad q_i(w) = \beta w_i, \quad i = 2,\dots,k.$

Then by (4.25), for small $w^i$,

(4.27)   $\|[m(w^1) - m(w^2)] - [q(w^1) - q(w^2)]\| \le \|w^1 - w^2\|.$

Finally, note that as a consequence of (4.27) we also have

(4.28)   $\pi_i^V\big(m(w^1) - m(w^2) - (q(w^1) - q(w^2))\big) \le \|w^1 - w^2\|.$

This completes our listing of properties of m and its linear
approximation. We can now turn to the final part of the proof, the
comparison of $W_F$ and $W_G$, which we commence by comparing $\tilde F$ and $\tilde G$.

Firstly, let N be a small enough neighbourhood of zero in W
space so that (4.24) - (4.28) are true for $w^i \in N$. Take $w^1, w^2 \in N$
with $w^1 \curlywedge w^2 \in N$. Suppose

(4.29)   $w^1 \curlywedge w^2 = w^p$ for $p = 1$ or $2$.

Then

(4.30)   $\tilde F(m(w^1) \curlywedge m(w^2)) = \tilde F(m(w^1)) \wedge \tilde F(m(w^2))$

         $\qquad = \tilde G(w^1) \wedge \tilde G(w^2)$

         $\qquad = \tilde G(w^p)$

         $\qquad = \tilde G(w^1 \curlywedge w^2).$

Now consider the case $w^3 := w^1 \curlywedge w^2 \ne w^p$ for either $p = 1$ or $2$. We
shall obtain (4.30) also for this case, but with inequality replacing the
equality. For each coordinate $j = 1,\dots,k$, $w^3 = w^1 \curlywedge w^2$ implies that

         $\pi_j^W(w^1) - \pi_j^W(w^3) = 0$  or  $\pi_j^W(w^2) - \pi_j^W(w^3) = 0.$

Fix j, and let $p := p(j) = 1$ or $2$ be such that

(4.31)   $\pi_j^W(w^p) - \pi_j^W(w^3) = 0.$

Thus $w^3 = (w_1^{p(1)},\dots,w_k^{p(k)})$. Rewrite q as

         $q(w) = \beta w + (\rho - \beta) w_1 V_1,$

and apply this to (4.31) with $w = w^p - w^3$ to obtain, via the linearity
of $\pi^V$,

(4.32)   $\pi_j^V(q(w^p - w^3)) = -2a(w_1^p - w_1^3)\,\pi_j^V(V_1).$

Now  note,  from  the  definition  of  w^  and  since  wm  >y  w3  , 

wp  -  w3  ■  £  <(wP.w3)/^  I  |wP  -  w3|  I/viT 

i  1  ±-i  1  1  A 

Furthermore,  ±  follows  from  the  definition -of  and  a  that 

■Joy  >  «. 

Substituting the above two inequalities into (4.32) yields

         $\pi_j^V(q(w^p - w^3)) \le -2\|w^p - w^3\|.$

Combining this with (4.28) thus yields

(4.33)   $\pi_j^V(m(w^p) - m(w^3)) \le -\|w^p - w^3\| < 0.$

However, what we have just shown is that for every $j = 1,\dots,k$
there is a $p = p(j)$ satisfying (4.33). Consequently, for every $w^1, w^2 \in N$
with $w^1 \curlywedge w^2 \in N$, it follows that

         $m(w^1 \curlywedge w^2) \gg (m(w^1) \curlywedge m(w^2)),$

from which it follows that

(4.34)   $\tilde F(m(w^1) \curlywedge m(w^2)) \le \tilde F(m(w^1 \curlywedge w^2))$

         $\qquad = \tilde G(w^1 \curlywedge w^2).$


Combining this with (4.30) we find that the above inequality holds for
all $w^1, w^2 \in N'$, where $N' \subset N$ is a neighbourhood of zero such that
$w^1, w^2 \in N'$ implies $w^1 \curlywedge w^2 \in N$, and, consequently, that (4.34) holds.

To obtain the final comparison between F and G, we need to
return to the original coordinate system. However, this is now easy,
for since the "minimum" relationship in (4.34) is really that of the
original coordinate system it trivially follows that via m we have
constructed a map, say $m^*$, from some neighbourhood $N^*$ of $y^*$ to a
neighbourhood $m^*(N^*)$ of $x^*$ satisfying

         $F(m^*(y)) = G(y), \quad y \in N^*,$

         $F(m^*(y^1) \wedge m^*(y^2)) \le G(y^1 \wedge y^2), \quad y^1, y^2 \in N^*.$

Slepian's inequality now yields

(4.35)   $P\{\sup_{I^k} W_F(x) > \lambda\} \ge P\{\sup_{m^*(N^*)} W_F(x) > \lambda\}$

         $\qquad \ge P\{\sup_{N^*} W_G(y) > \lambda\}$

         $\qquad \ge P\{\sup_{B_\delta} W_G(y) > \lambda\},$

where $\delta$ is chosen small enough so that $B_\delta := [(\tfrac12)^{1/k} - \delta,\ (\tfrac12)^{1/k} + \delta]^k \subset N^*$.
(Note that $\delta$ depends on $N^*$, and so on F.) Thus, to complete the proof,
we need only find a lower bound for the last probability.

k 

Take  e  =  ^(26)  ,  and  consider  the  d.f.  F  of  Section  2  on  A  . 

e  e 

Map  yeB^  to  zeA^  according  to  the  coordinate  mappings 

z±  -  *±(y)  -  h  +  (26)k~1[y1-(h)1/k]  . 


Then it is straightforward to check that for $y \in B_\delta$

(4.36)   $F_\varepsilon(z(y)) = G(y).$

Furthermore, if $y^1, y^2 \in B_\delta$ then $y^1 \wedge y^2 \in B_\delta$, and

(4.37)   $z(y^1 \wedge y^2) = z(y^1) \wedge z(y^2).$

Consequently,

         $P\{\sup_{B_\delta} W_G(y) > \lambda\} = P\{\sup_{A_\varepsilon} W_{F_\varepsilon}(z) > \lambda\}.$

But this last probability is known, and is bounded from below in
Theorem 2.2 by $c\lambda^{2(k-1)}e^{-2\lambda^2}$. Combining this with (4.35) completes
the proof of the theorem.

5.  AN UPPER-LOWER CLASS THEOREM


We now return to the one-sided KS statistic $T_n^{(k)}$ of the
introduction, and study the way it grows with n. In a fundamental paper
treating the one-dimensional case Chung (1949) proved the following result
for a sequence $\lambda(n) \uparrow \infty$:

(5.1)    $P\{T_n^{(1)} > \lambda(n) \text{ infinitely often}\} = 0\ (1)$

according as

(5.2)    $\sum_n \frac{\lambda^2(n)}{n}\, e^{-2\lambda^2(n)} < \infty\ (= \infty).$

Kiefer (1961) obtained a weaker version of Chung's result for the
multivariate case, and proved the following LIL for every k and continuous F:


T 

(^•3)  i  b  p{lim  sup  - — 


=  1}  *  P{lim  inf 


n  •*  »  (h  log  log  n) 


n  -*■  »  (Jslog  log  n) 


Kiefer's proof of (5.3) was based on inequality (1.3), which is not fine
enough to pick up the higher iterated logarithm terms that (5.2) yields.
Having improved on Kiefer's inequality in the previous sections (at least
insofar as the limit process $W_F$ is concerned) we can now complete the
task Kiefer began and obtain a multi-dimensional analogue of (5.1).

Unlike Chung's and Kiefer's basic inequalities for $P\{T_n > \lambda\}$, we
have only inequalities for $P\{\sup_x W_F(x) > \lambda\}$, and so we shall need to
proceed via an embedding theorem. To this end, for continuous F on $I^k$
define the Kiefer process as the $C[0,1]^k$-valued, real parameter process
$K_t$, $t \ge 0$, satisfying:

(5.4)    $P\{K_1 \in A\} = P\{W_F \in A\},$

(5.5)    $P\{(K_t - K_s) \in A\} = P\{K_{t-s} \in A\} = P\{\sqrt{t-s}\,K_1 \in A\}$, for $t > s$,

(5.6)    $(K_t - K_s)$ and $K_u$ are independent for all $t > s \ge u$.

Here A is any Borel subset of $C[0,1]^k$, with topology generated
by the sup norm $\|k\| = \sup\{|k(t)|: t \in I^k\}$. Then Theorem 7.1 of Dudley
and Philipp (1983) implies the following embedding theorem, which is a
strengthening of an earlier result of Kiefer (1972).

Theorem 5.1 (Dudley-Philipp)  Let $X_1, X_2, \dots$ be an infinite sequence of
i.i.d. r.v.'s, defined on an infinite product space $(R^\infty, B^\infty, P^\infty)$
with common d.f. F. Let $(\Omega, \Sigma, \Pr)$ be the product of $(R^\infty, B^\infty, P^\infty)$ and a
copy of the unit interval with Lebesgue measure. Let $F_n$ be the empirical
d.f. based on $X_1,\dots,X_n$. Then, for every $\theta > 0$ there exists a Kiefer
process $K_t$, $t \ge 0$, defined on $\Omega$, such that

(5.7)    $\sup_x |n[F_n(x) - F(x)] - K_n(x)| \le O(n^{1/2}(\log n)^{-\theta})$

with Pr probability one.

As an immediate consequence of this result, along with a LIL for
sums of Banach space random variables, it is now easy to obtain Kiefer's
LIL, (5.3). (c.f. Kuelbs and Philipp (1980) and Goodman, Kuelbs and
Zinn (1981), esp. Theorem 6.1). Indeed, the Banach space results
yield much more than (5.3), for they also identify the cluster points of
$F_n$ in $C[0,1]^k$ in terms of the unit ball of a certain Hilbert space. It
is not possible, however, to follow this path to obtain a multivariate
version of Chung's upper-lower class theorem, the problem being that no
appropriate upper-lower class theorem is known for $K_t$. (Note that
whereas Kuelbs (1975) does have a result of this type for $K_t$, it is
not applicable here, since it gives results not for the growth of $\|K_t\|$
but the growth of $\|K_t\|_*$, where $\|\cdot\|_*$ is another unspecified norm
(albeit equivalent to the sup norm).) Consequently we shall have to revert
to an almost basic principles analysis to obtain a generalization of
Chung's theorem.

To state our result, we shall say a non-negative, non-decreasing,
continuous function $\psi(t)$ defined for large values of t is a lower
function for $\{K_t, t \ge 0\}$ if

(5.8)    $P\{\|K_n\| > n^{1/2}\psi(n) \text{ for an unbounded set of } n\text{'s}\} = 1,$

and an upper function for $\{K_t: t \ge 0\}$ if

(5.9)    $P\{\|K_t\| > t^{1/2}\psi(t) \text{ for only a bounded set of } t\text{'s}\} = 1.$

Since the definition of $K_t$ is dependent on F, whether or not any given
$\psi$ is a lower or upper function depends on F as well as $\psi$. Thus we
write $\psi \in L(F)$ and $\psi \in U(F)$, respectively, to denote this dependency.

Note that (5.8) implies the weaker condition,

         $P\{\|K_t\| > t^{1/2}\psi(t) \text{ for an unbounded set of } t\text{'s}\} = 1,$

which is usually taken as the definition of a lower function. However,
the stronger result (5.8) is what is needed to apply Theorem 5.1,
and since our proof will be strong enough to prove (5.8) we use it to
define the notion of lower class. We can now state

Theorem 5.2  Let F be continuous on $I^k$ with uniform marginals. For
$\psi$ as above, set

(5.10)   $I_k(\psi) = \int^\infty \frac{\psi^{2k}(t)}{t}\, e^{-2\psi^2(t)}\,dt.$

If $I_k(\psi) < \infty$ then $\psi \in U(F)$. Furthermore, if F satisfies the conditions
of Theorem 4.2, and $I_k(\psi) = \infty$, then $\psi \in L(F)$.

A  simple  argument,  dating  back  at  least  to  Erdos  (1942)  and  spelled 
out  in  detail  in  Sirao  (1959),  shows  that  there  is  no  loss  of  generality 
in  Theorem  5.2  in  assuming  that  for  large  t 

(5.11)  Oclog  log  t )**  £  ij>(t)  i  O-og  log  t)*5  . 

Furthermore,  a  straightforward  application  of  the  Abel-Dini  theorem  easily 
yields  the  following  corollary. 

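Corollary 5.1  Let $p \ge 3$ be integral, and define

         $\psi_{k,\delta}(t) := 2^{-1/2}[\log_2 t + (k+1)\log_3 t + \log_4 t + \dots + (1+\delta)\log_{p+1}(t)]^{1/2}.$

Then $\delta > 0$ implies $I_k(\psi_{k,\delta}) < \infty$ and $\delta < 0$ implies $I_k(\psi_{k,\delta}) = \infty$,
so that $\psi_{k,\delta} \in U(F)$ if $\delta > 0$ and $\psi_{k,\delta} \in L(F)$ if $\delta < 0$ and F satisfies
the conditions of Theorem 4.2.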
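
To see why the sign of $\delta$ decides convergence, it helps to evaluate $e^{-2\psi^2}$ for the functions of Corollary 5.1. The sketch below assumes that $\log_j$ denotes the j-fold iterated natural logarithm (the usual convention in upper-lower class results, but an assumption here), and uses hypothetical function names. It computes $\psi_{k,\delta}$ and the product form of $e^{-2\psi_{k,\delta}^2(t)}$, namely $[\log t\,(\log_2 t)^{k+1}\log_3 t \cdots \log_{p-1} t\,(\log_p t)^{1+\delta}]^{-1}$.

```python
import math


def iterated_log(t: float, j: int) -> float:
    """j-fold iterated natural logarithm of t (t must be large enough)."""
    value = t
    for _ in range(j):
        value = math.log(value)
    return value


def psi(t: float, k: int, delta: float, p: int = 3) -> float:
    """The function psi_{k,delta}(t) of Corollary 5.1 under the convention above."""
    total = iterated_log(t, 2) + (k + 1) * iterated_log(t, 3)
    for j in range(4, p + 1):
        total += iterated_log(t, j)
    total += (1.0 + delta) * iterated_log(t, p + 1)
    return math.sqrt(total / 2.0)


def tail_factor(t: float, k: int, delta: float, p: int = 3) -> float:
    """Product form of exp(-2 psi^2), using exp(-log_{j+1} t) = 1 / log_j t."""
    value = iterated_log(t, 1) * iterated_log(t, 2) ** (k + 1)
    for j in range(3, p):
        value *= iterated_log(t, j)
    value *= iterated_log(t, p) ** (1.0 + delta)
    return 1.0 / value
```

Multiplying `tail_factor` by $\psi^{2k}(t)/t$, which is of order $(\log_2 t)^k/t$, gives an integrand of order $[t \log t \log_2 t \cdots \log_{p-1} t\,(\log_p t)^{1+\delta}]^{-1}$, whose integral converges exactly when $\delta > 0$.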

As a further consequence of (5.11) and Theorem 5.1 we can also
derive the following corollary of Theorem 5.2, which generalises Chung's
uni-variate test:

Corollary 5.2  For all F,

         $P\{T_n^{(k)} > \psi(n) \text{ i.o.}\} = 0$  if  $\sum_n \frac{\psi^{2k}(n)}{n}\, e^{-2\psi^2(n)} < \infty.$

For F satisfying the conditions of Theorem 4.2

         $P\{T_n^{(k)} > \psi(n) \text{ i.o.}\} = 1$  if  $\sum_n \frac{\psi^{2k}(n)}{n}\, e^{-2\psi^2(n)} = \infty.$

This result, of course, implies Kiefer's LIL, (5.3). All that now
remains is the

Proof of Theorem 5.2.  We consider the convergent case first, i.e., $I_k(\psi) < \infty$.
Define a sequence $t_n$ satisfying

(5.12)   $t_{n+1} = t_n(1 + \psi^{-2}(t_n)),$

where $t_1 > 3$ is sufficiently large so that (5.11) holds for $t > t_1$,
and so $\lim t_n = \infty$. Set $I_n = [t_n, t_{n+1}]$ and

         $A_n = \{\sup_{t \in I_n} \frac{\|K_t\|}{t^{1/2}\psi(t)} > 1\}.$

Then, applying the Banach space version of Levy's inequality, we have

         $P\{A_n\} \le P\{\sup_{t \in I_n} \|K_t\| > t_n^{1/2}\psi(t_n)\}$

         $\qquad \le 2P\{\|K_{t_{n+1}}\| > t_n^{1/2}\psi(t_n)\}$

         $\qquad = 2P\{t_{n+1}^{-1/2}\|K_{t_{n+1}}\| > (t_n/t_{n+1})^{1/2}\psi(t_n)\}.$

Now apply the scaling law (5.5), and Theorem 4.1 to obtain

         $P\{A_n\} \le C[\psi(t_n)]^{2(k-1)}\exp[-2\psi^2(t_n)],$

since $t_n < t_{n+1}$ and $(t_n/t_{n+1}) = (1 + \psi^{-2}(t_n))^{-1} > 1 - \psi^{-2}(t_n) > \tfrac12$ for
large enough n.

To complete the proof it is clearly sufficient to show $\sum_1^\infty P\{A_n\}$
converges. But

         $\sum_n P\{A_n\} \le C\sum_n [\psi(t_n)]^{2(k-1)} e^{-2\psi^2(t_n)}$

         $\qquad = C\sum_n \int_{t_{n-1}}^{t_n} [\psi(t_n)]^{2(k-1)} e^{-2\psi^2(t_n)}\,\frac{ds}{t_n - t_{n-1}}$

         $\qquad \le C\int^\infty \frac{\psi^{2k}(s)}{s}\, e^{-2\psi^2(s)}\,ds$

         $\qquad = C\,I_k(\psi),$

the last inequality from the definition of $t_n$ and the ultimate
monotonicity of $\psi^{2k}(s)e^{-2\psi^2(s)}$. By assumption $I_k(\psi) < \infty$, and so the proof
of the convergent part of the theorem is complete.

Now assume $I_k(\psi) = \infty$. Let $\alpha_n = (\log n)^2$,
$\beta_n = \prod_{i=2}^{n} \alpha_i$, and $t_n = [\beta_n]$, $n \ge 2$. Also set

9 


Following  Chung’ 8  (1949)  argument,  set 


Then,  if 


That  is, 
and  H 

m 


That  is 


Clearly, 

theorem. 


H  =  {[  Ik  1 1  <  tfyt  )}, 


n  n 


Hn,n+1  *  ('|Kt  +,  '  Kt  11  >  “V'Vl-'.'V1 
n+1  n 


both  H  and  H_  occur,  we  have 

n  n,n+l 


\  .J1  -  -  *»♦“«> 


n+1 


tn+l'|i(tn+l)  (  (1'tn',tn+l),,<'1+En) 


Hn-Hn,n+1_>  Hm+1  '  1’"“'  notlI,«  from  (5‘6>  th,t  H„,„+l 

are  Independent  for  m  _<  n,  it  follows  that 


n  n 

P{  w  H  >  •  rfe  )  <  P(  *  H 
w*  "  n>n+1  "  wi  " 


n+1 

P{  v  H 
bp*2 


}  <  p{  IT  H  } 
m  —  .  m 

m-2 


(1-p(Ea.,+l» 


<  P(»2>- 


n 


TT  (1-P{H 
m=2 


m,m+l 


}) 


then,  if  we  can  show  E  P{H 

n-2  “•n+1 

Applying  Theorem  4.2  and  (5.5)  we  have 


we  will  have  proven  the 
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<5‘13)  P{Hn,n+l}  -  C(1+£n) 


2  (k-1)  [iKtn+1)  ]2  (k-1)  exp  {-2  *2  (t^,  x  (l+e_  )  2  } 


n+1) n; 


Consider  the  exponent,  and  note  that 


1 -  <  __n_  <  - <  — i-  +  o(l/6  a  ) 


—  -  -  -  _  .  ^  , - r-  ^ 

ft  g  _  f-  —  a  —8  —  a 

n+l  n+1  n+1  n+1  n  n+1 


n  n+1 


Consequently 


<!«/  -  1  +  3/«n+1  +  OCT^), 


so  that,  by  (5.11), 


*2(tn+l>d+en)2  ~  +  (log  l0g  ^n+l5  ’  4/ai 


n+1 


n+1  n+1 


n+1 


log  ^  log2m 


n+1  4  log2 (n+1) 


*2<W  +  l- 


for  large  enough  n.  Substituting  into  (5.13)  and  setting  Yn**(on+^) 
yields 

EPtH  }>CE  [<Kt)]2(k"1)e'2l,'2(tn) 
n  n»n+1  n=3  n 


1+1/n 


>  CE  /ttlUtu)f»-1V2*Hc n>  * 


“n+1  r.,  n.  n2k  -2i|>2(sn)  a 

>  c  e  f  LMsJLl — 2 — - * - 

n  a  8  ^(6  )(.Y  -a  ) 


ds 


n  n  n' 


since  ^2k(t)e  ^  is  eventually  decreasing  in  t.  A  change  of  variables 


leads  to 
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n+1 


,  n+1  r„.,.  xl2k 

ip(h  ,,}>ci.  /  - dt, 

B  n'n+1  “  »  n  <«  )n  fc 


where 


(5.14)  a  - 


n  n**(BB)  (Y„-a  ) 

n  n  n 


If we can now show that $a_n$ is bounded away from zero for large enough
n, we shall have $\sum_n P\{H_{n,n+1}\} \ge C\,I_k(\psi) = \infty$, and the proof will be
complete.

Firstly, note that by (5.11)

(5.15)   $\psi^2(\beta_n) \le \log\log\big(\prod_{m=2}^{n} \log^2 m\big) = \log\big(\sum_{m=2}^{n} \log(\log^2 m)\big)$

         $\qquad \le \log((n-2)\,2\log\log n)$

         $\qquad \le 2\log n.$


Furthermore 


(5.16) 


«  /Of  -®>  >  a  /(a  -a  ) 

n  n.  n  n  n+i  n 


r(log  n+1)2 
L  (log  n)2 


>  _  i  I"1 

-  L  log  n  A  J  * 

Substituting  (5.15)  and  (5.16)  into  (5.14)  yields 
*n  >,  \  [n(log(n  +  1)-  log  n)]  1 


■  k  [n  log  (1+  1/n)]”1 

>  1/8, 


which completes the proof of the theorem.